On Extreme Pruning of Random Forest Ensembles for Real-time Predictive Applications
Authors
Abstract
Random Forest (RF) is an ensemble supervised machine learning technique that was developed by Breiman over a decade ago. Compared with other ensemble techniques, it has proved its accuracy and superiority. Many researchers, however, believe that there is still room for improving its predictive accuracy. This explains why, over the past decade, there have been many extensions of RF, each employing a variety of techniques and strategies to improve certain aspect(s) of RF. Since it has been shown empirically that ensembles tend to yield better results when there is significant diversity among the constituent models, the objective of this paper is twofold. First, it investigates how data clustering (a well-known diversity technique) can be applied to identify groups of similar decision trees in an RF and thereby eliminate redundant trees, selecting one representative from each group (cluster). Second, these likely diverse representatives are then used to produce an extension of RF, termed CLUB-DRF, that is much smaller in size than RF, performs at least as well as RF, and in most cases exhibits higher accuracy. The latter step is an instance of a known technique called ensemble pruning. Experimental results on 15 real datasets from the UCI repository demonstrate the superiority of our proposed extension over the traditional RF. Most of our experiments achieved a pruning level of 95% or above while retaining or outperforming the accuracy of RF.
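As a rough illustration of the clustering-based pruning idea described above, the sketch below clusters the trees of a trained Random Forest by the similarity of their predictions on a held-out set and keeps one representative tree per cluster. It is a minimal sketch using scikit-learn; the choice of KMeans, the representative-selection rule, and all function and variable names are assumptions of this illustration, not the exact CLUB-DRF procedure from the paper.

```python
# Sketch of clustering-based ensemble pruning in the spirit of CLUB-DRF:
# cluster trees by their prediction vectors on a held-out set, then keep
# one representative tree per cluster. Illustrative only; not the paper's
# exact algorithm.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_train, y_train)

# Represent each tree by its prediction vector on the held-out set.
tree_preds = np.array([tree.predict(X_val) for tree in rf.estimators_])

# Cluster similar trees; the number of clusters controls the pruning level
# (25 representatives out of 500 trees corresponds to a 95% pruning level).
n_clusters = 25
km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(tree_preds)

# From each cluster, keep the tree whose predictions lie closest to the
# cluster centroid as that cluster's representative.
representatives = []
for c in range(n_clusters):
    members = np.where(km.labels_ == c)[0]
    dists = np.linalg.norm(tree_preds[members] - km.cluster_centers_[c], axis=1)
    representatives.append(members[np.argmin(dists)])

# The pruned ensemble predicts by majority vote over the representatives.
pruned_votes = np.array([rf.estimators_[i].predict(X_val) for i in representatives])
majority = np.apply_along_axis(lambda v: np.bincount(v.astype(int)).argmax(), 0, pruned_votes)
print("Pruned-ensemble validation accuracy:", np.mean(majority == y_val))
```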
Similar resources
An Outlier Detection-based Tree Selection Approach to Extreme Pruning of Random Forests
Random Forest (RF) is an ensemble classification technique that was developed by Breiman over a decade ago. Compared with other ensemble techniques, it has proved its accuracy and superiority. Many researchers, however, believe that there is still room for enhancing and improving its performance in terms of predictive accuracy. This explains why, over the past decade, there have been many exten...
Relevant Ensemble of Trees
Tree ensembles are flexible predictive models that can capture relevant variables and to some extent their interactions in a compact and interpretable manner. Most algorithms for obtaining tree ensembles are based on versions of boosting or Random Forest. Previous work showed that boosting algorithms exhibit a cyclic behavior of selecting the same tree again and again due to the way the loss is...
Adaptive Predictive Controllers Using a Growing and Pruning RBF Neural Network
An adaptive version of a growing and pruning RBF neural network has been used to predict the system output and implement Linear Model-Based Predictive Controller (LMPC) and Non-linear Model-Based Predictive Controller (NMPC) strategies. A radial-basis neural network with growing and pruning capabilities is introduced to carry out on-line model identification. An Unscented Kal...
Mondrian Forests: Efficient Online Random Forests
Ensembles of randomized decision trees, usually referred to as random forests, are widely used for classification and regression tasks in machine learning and statistics. Random forests achieve competitive predictive performance and are computationally efficient to train and test, making them excellent candidates for real-world prediction tasks. The most popular random forest variants (such as ...
Root Attribute Behavior within a Random Forest
Random Forest is a computationally efficient technique that can operate quickly over large datasets. It has been used in many recent research projects and real-world applications in diverse domains. However, the associated literature provides little information about what happens in the trees within a Random Forest. The research reported here analyzes the frequency with which an attribute appears in the...
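For readers who want to try a similar analysis, the short sketch below counts how often each attribute appears as the root split across the trees of a scikit-learn Random Forest. The dataset and counting approach are illustrative assumptions, not taken from the cited paper.

```python
# Illustrative sketch (not from the cited paper): count the feature used at
# the root node of each tree in a scikit-learn Random Forest.
from collections import Counter
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# tree_.feature[0] is the index of the feature used at each tree's root node.
root_features = Counter(int(t.tree_.feature[0]) for t in rf.estimators_)
print(root_features)
```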
Journal: CoRR
Volume: abs/1503.04996
Pages: -
Publication year: 2015